feature: Add common TranscriptionModel interface for audio transcription #1484
- Created `TranscriptionModel` interface that extends `Model<AudioTranscriptionPrompt, AudioTranscriptionResponse>`
- Implemented `call(AudioTranscriptionPrompt)` method for better compatibility between OpenAI and Azure OpenAI transcription models
- Added default convenience methods for handling `Resource` and `AudioTranscriptionOptions` to return transcription as a String
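The shape described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code merged in this PR: every type below is a simplified stand-in declared locally so the example compiles without Spring AI on the classpath. `AudioResource` stands in for Spring's `Resource`, the method name `transcribe` and the `language` option are assumptions made for illustration, and `FakeTranscriptionModel` is a hypothetical implementation used only to exercise the default methods.

```java
// Stand-in for the generic Model<TReq, TRes> contract named in the PR description.
interface Model<TReq, TRes> {
    TRes call(TReq request);
}

// Simplified, hypothetical replacements for the real Spring AI / Spring types.
record AudioResource(String name, byte[] content) {}          // stands in for Resource
record AudioTranscriptionOptions(String language) {}           // assumed single option
record AudioTranscriptionPrompt(AudioResource audio, AudioTranscriptionOptions options) {}
record AudioTranscriptionResponse(String text) {}

// Sketch of a common transcription interface with default convenience methods.
interface TranscriptionModel extends Model<AudioTranscriptionPrompt, AudioTranscriptionResponse> {

    // The one method each provider-specific model has to implement.
    @Override
    AudioTranscriptionResponse call(AudioTranscriptionPrompt prompt);

    // Convenience: transcribe with no explicit options and return plain text.
    // The name "transcribe" is an assumption, not taken from the PR.
    default String transcribe(AudioResource audio) {
        return call(new AudioTranscriptionPrompt(audio, null)).text();
    }

    // Convenience: transcribe with explicit options and return plain text.
    default String transcribe(AudioResource audio, AudioTranscriptionOptions options) {
        return call(new AudioTranscriptionPrompt(audio, options)).text();
    }
}

// Hypothetical provider used only to show that the defaults delegate to call(...).
class FakeTranscriptionModel implements TranscriptionModel {
    @Override
    public AudioTranscriptionResponse call(AudioTranscriptionPrompt prompt) {
        String language = prompt.options() == null ? "default" : prompt.options().language();
        return new AudioTranscriptionResponse(
                "transcript of " + prompt.audio().name() + " [" + language + "]");
    }
}

class TranscriptionSketch {
    public static void main(String[] args) {
        TranscriptionModel model = new FakeTranscriptionModel();
        AudioResource audio = new AudioResource("hello.wav", new byte[] {1, 2, 3});

        System.out.println(model.transcribe(audio));
        System.out.println(model.transcribe(audio, new AudioTranscriptionOptions("en")));
    }
}
```

Because callers depend only on `TranscriptionModel`, an OpenAI-backed and an Azure OpenAI-backed implementation could in principle be swapped without changing calling code, which is the compatibility goal stated in the description.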
This looks pretty much like what I had in mind. Well done.
Meanwhile, as an enrichment to transcription, take a look at #1278.
@@ -0,0 +1,22 @@
package org.springframework.ai.model;
Should this interface be in the package `org.springframework.ai.model.audio.transcription`?
Yes, it should.
Model interfaces should be placed in packages that reflect their functional domain:
- For single-level domains: `org.springframework.ai.<domain>`
- For hierarchical domains: `org.springframework.ai.<category>.<subdomain>`
| Model Interface | Package Location |
|---|---|
| EmbeddingModel | org.springframework.ai.embedding |
| ModerationModel | org.springframework.ai.moderation |
| TextToSpeechModel | org.springframework.ai.audio.tts |
Tests should be added too.
Is there another transcription model we can use to verify the abstraction?
I've added additional tests to check the default methods, etc. Merged in 4cf2377. Thanks @mudabirhussain and others.
Resolution of this opened issue: #1478